Introduction to Neural Nets - part 1
This article appeared in the July 1999 issue of the Robot Builder.
by Arthur Ed LeBouthillier
Neural nets excite the imagination of many as an alternative way of creating smart processing systems for use in robots. They offer the possibility of creating intelligent systems without all of the requirements of traditional programming, and they offer the added benefit of being able to run simultaneously on multiple computers, thus taking better advantage of parallel processing.
The following is a brief introduction to neural nets. First, we will examine a few of the characteristics of animal neurons; then we will examine a model of them. In a future article, we will review a learning algorithm which can be applied to this model.
Animal Nervous Systems
Animal brains are composed of numerous specialized processing cells called neurons, plus other support cells such as glia. The glia are thought to aid in the distribution of nutrients and to act as space fillers between neurons. The neurons provide the processing elements, which are interconnected to provide paths for signal flow.
There are many different types of neurons which perform varying functions when combined in circuits.
Figure 1 provides a rough overview of the structure of a neuron. For the most part, a signal flows from the dendrites through the soma to the axon. In real life, potential fields created between closely spaced cells can also affect a cell's firing.
A neuron generally exhibits an all-or-nothing behavior whereby the cell is either in a resting state or in a pulsed firing state. For some neurons, there is a steady firing rate which depends upon the sum of the inputs; other neurons do not fire unless a certain input level has been reached.
Another characteristic of real neurons is a certain amount of propagation delay between input and output. Because of propagation delays, some kinds of processing can occur in the dendrites as coinciding signals add to or subtract from each other.
The Neural Net Model
From studies of the way neurons work, various models have been created to explain them. Some of these models are very complicated, such as cable or compartment models, whereas others are simpler.
One of the simpler models is the Neural Net model (as opposed to Neuronal Nets). The neural net model simulates each cell as a simple weighted sum of its inputs passed through a non-linear function called a sigmoid function.
Figure 2 illustrates this model. There are the inputs (i1, i2, …, in) with weight values (w1, w2, …, wn), a summation of the inputs (S) and a sigmoid function (f).
Mathematically, they are related by the equation:
output = f( i1 * w1 + i2 * w2 + … + in * wn )
The input values can range from 0 to 1, the weight values range from -1 to 1, and the output varies from 0 to 1. Excitatory inputs use positive-valued weights; inhibitory inputs use negative-valued weights. Notice that this model does not include any representation of the delay inherent in a real neuron; however, discrete delays can be modeled by using a single-input neuron with a unity weight. This provides a discrete delay equal to the time it takes to process one neuron.
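As a worked example (with made-up values), consider a two-input neuron with inputs i1 = 0.5 and i2 = 1.0 and weights w1 = 0.8 and w2 = -0.3. Its output would be:
output = f( 0.5 * 0.8 + 1.0 * (-0.3) ) = f( 0.1 )
The positive weight acts as an excitatory input and the negative weight as an inhibitory one; the sigmoid function f then squashes the 0.1 sum into the 0-to-1 output range.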
The Sigmoid Function
So far, the model looks pretty simple: all we need to do is multiply each input by its associated weight value and sum all of these individual results together. Once we have this sum, we run it through the sigmoid function. A number of different sigmoid functions are used, but whichever function is chosen, they all share the same characteristic: an “s”-shaped curve, as demonstrated in figure 3.
Figure 3 illustrates a function which stays low throughout a wide portion of its range and which rather suddenly, but continuously, rises to a value of 1. This extreme non-linearity at the transition point models the all-or-none characteristic of real neurons. If the sum of the inputs times their weights reaches the transition value, then the output quickly jumps up to the full activation value.
As was said earlier, a number of sigmoid functions exist. Figure 4 illustrates a typical one. This function provides the sigmoid shape needed for the output. It takes as input the sum of the weights times the inputs and produces an output which varies between 0 and 1. The constant e is the base of the natural logarithm, approximately 2.71828.
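Assuming Figure 4 shows the common logistic form, f(x) = 1 / (1 + e^-x), the sigmoid can be coded in BASIC as a short function (a minimal sketch; this is the sigmoid routine called by the program in the next section):
' Logistic sigmoid: squashes any input sum into an
' output between 0 and 1. EXP(-x) computes e^-x.
function sigmoid (x)
    sigmoid = 1 / (1 + EXP(-x))
end function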
The Multi-Layer Perceptron
Now that we have a simple model of a neuron, we can connect neurons into basic circuits. One of the more important structures that can be created is the Multi-Layer Perceptron (MLP). The multi-layer perceptron has been shown to be able to approximate nearly any function if you use enough neurons. Figure 5 shows a simple example of a multi-layer perceptron.
A multi-layer perceptron consists of an input layer, a hidden layer and an output layer. It is able to represent essentially any functional relationship between its inputs and outputs. For example, you could use a multi-layer perceptron in optical character recognition; the MLP could be taught to relate the video pixels making up a picture of an alphabetic letter to the ASCII code representing it.
Implementing the Simple Neural Net Model
Using a neural net once the weight values are known is simple. The primary operation in neural net math is a multiply and add: we multiply each weight by its input value and add the result to a running sum. The following code example shows how simple the operation is.
' ---------------------------------------------
' A simple 3x3 multi-layer perceptron
' This program assumes that the values
' of the weights have already been put
' into the weight arrays
' Get the input values
input_layer(1) = input_1
input_layer(2) = input_2
input_layer(3) = input_3
' calculate the hidden layer values
for hiddenCell = 1 to 3
    sum = 0
    for weightCnt = 1 to 3
        sum = sum + hidden_weight(hiddenCell, weightCnt) * input_layer(weightCnt)
    next weightCnt
    hidden(hiddenCell) = sigmoid(sum)
next hiddenCell
' calculate the output values
for outputCell = 1 to 3
    sum = 0
    for weightCnt = 1 to 3
        sum = sum + output_weight(outputCell, weightCnt) * hidden(weightCnt)
    next weightCnt
    output(outputCell) = sigmoid(sum)
next outputCell
' send the output values
output_1 = output(1)
output_2 = output(2)
output_3 = output(3)
This program is quite simple, mapping three input values to three output values. However, as it stands, the program will not work because there are no values in the weight arrays.
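As a quick way to exercise the program (not a substitute for real training), you could fill the weight arrays with random values in the -1 to 1 range first. A minimal sketch:
' Hypothetical test harness: load the weight arrays with
' random values just to exercise the program. Useful
' weights must come from a learning algorithm.
RANDOMIZE TIMER
for i = 1 to 3
    for j = 1 to 3
        hidden_weight(i, j) = RND * 2 - 1    ' random value between -1 and 1
        output_weight(i, j) = RND * 2 - 1
    next j
next i
With random weights, of course, the outputs are meaningless; getting useful weight values is the subject of the next section.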
Learning Algorithms
How do we get the weight values? This requires a learning algorithm. We must engage in a training session with the neural net whereby we present input samples and adjust the network until it produces the proper output values. Doing this requires an algorithm that compares the actual output with the desired output for a given input and determines the proper values for the weights.
There are numerous learning algorithms that determine the weights and connections, such as Hebb's learning algorithm, the back-propagation algorithm and others. Each of these algorithms, as part of a learning session, will produce the weight values necessary to make a neural net work properly for its training set and similar inputs. There are also algorithms that are able to learn without being told what their results should be.
Next Month
We’ll look at a simple learning algorithm allowing you to teach a neural net to do something useful.